A Hybrid Algorithm for Recognizing the Position of Ezafe Constructions in Persian Texts

Authors

  • Samira Noferesti
  • Mehrnoush Shamsfard
Abstract

In the Persian language, an Ezafe construction is a linking element that joins the head of a phrase to its modifiers. In its simplest form the Ezafe is pronounced -e, but it is generally not indicated in writing. Determining the position of an Ezafe helps disambiguate the boundaries of syntactic phrases, which is a fundamental task in most natural language processing applications. This paper introduces a framework that combines genetic algorithms with rule-based models, bringing together the advantages of both approaches while overcoming their individual problems. The framework was used to recognize the position of Ezafe constructions in written Persian texts. In the first stage, the rule-based model tags some tokens of an input sentence. In the second stage, the search capabilities of the genetic algorithm are used to assign the Ezafe tag to the remaining untagged tokens, using previously captured training information. The proposed framework was evaluated on the Peykareh corpus and achieved 95.26 percent accuracy. Test results show that the proposed approach outperformed other approaches for recognizing the position of Ezafe constructions.
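The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the rules, the fitness function, or the genetic algorithm settings, so everything below is an assumption made for illustration: the part-of-speech labels, the `NEVER_EZAFE` rule set, the `P_EZAFE` probability table standing in for "training information", and all parameter values are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical stage-1 rules (illustrative only, not the paper's rule set):
# POS tags assumed never to carry an Ezafe.
NEVER_EZAFE = {"V", "PUNC", "CONJ"}

# Hypothetical training statistics: estimated P(Ezafe | POS, next POS).
P_EZAFE = {("N", "N"): 0.7, ("N", "ADJ"): 0.9, ("ADJ", "N"): 0.4, ("N", "V"): 0.05}

def rule_stage(pos_tags):
    """Stage 1: False where a rule fires, None where the token stays undecided."""
    tags = []
    for i, pos in enumerate(pos_tags):
        if i == len(pos_tags) - 1 or pos in NEVER_EZAFE:
            tags.append(False)  # sentence-final token or excluded POS
        else:
            tags.append(None)   # left for the genetic algorithm
    return tags

def fitness(bits, open_idx, pos_tags):
    """Score a candidate as the product of per-token probabilities."""
    score = 1.0
    for bit, i in zip(bits, open_idx):
        p = P_EZAFE.get((pos_tags[i], pos_tags[i + 1]), 0.5)
        score *= p if bit else 1.0 - p
    return score

def ga_stage(tags, pos_tags, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Stage 2: search Ezafe assignments for the undecided tokens only."""
    rng = random.Random(seed)
    open_idx = [i for i, t in enumerate(tags) if t is None]
    if not open_idx:
        return list(tags)
    n = len(open_idx)
    score = lambda c: fitness(c, open_idx, pos_tags)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = pop[:2]  # elitism: keep the two best candidates
        while len(next_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randint(0, n)                 # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    result = list(tags)
    for bit, i in zip(best, open_idx):
        result[i] = bit
    return result

def tag_ezafe(pos_tags):
    """Run the rule stage, then let the GA fill in the remaining tokens."""
    return ga_stage(rule_stage(pos_tags), pos_tags)
```

For example, `tag_ezafe(["N", "N", "ADJ", "V", "PUNC"])` returns one boolean per token; the last two positions are fixed to `False` by the toy rules, and only the first three are searched by the genetic algorithm. Restricting the chromosome to the undecided tokens is the point of the hybrid design: the rules shrink the search space before the search begins.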

Similar resources

A Probabilistic Approach to Persian Ezafe Recognition

In this paper, we investigate the problem of Ezafe recognition in the Persian language. Ezafe is an unstressed vowel that is usually not written, but is intelligently recognized and pronounced by humans. The Ezafe marker can be placed into noun phrases, adjective phrases and some prepositional phrases, linking the head and modifiers. Ezafe recognition in Persian is indeed a homograph disambiguation probl...

Lessons from building a Persian written corpus: Peykare

This paper addresses some of the issues learned during the course of building a written language resource (called ‘Peykare’) for contemporary Persian. After defining five linguistic varieties and 24 different registers based on these linguistic varieties, we collected the texts for Peykare to do a linguistic analysis, including cross-register differences. For tokenization of Persian, we have pr...

Persian Ezafe as a 'figure' Marker: a Unified Analysis

This article is a conceptual exploration of Ezafe in Modern Persian. I will consider cases where Ezafe seems to be conceptually non-neutral. In certain cases of the ‘X-e Y’ construction, X and Y can change their places with a shift in meaning while they are apparently frozen in their positions in other cases of Ezafe construction. The question to address here is if the Ezafe element -e marks an...

On the Importance of Ezafe Construction in Persian Parsing

Ezafe construction is an idiosyncratic phenomenon in the Persian language. It is a good indicator of phrase boundaries and dependency relations but mostly does not appear in the text. In this paper, we show that adding information about Ezafe construction can give a 4.6% relative improvement in dependency parsing and a 9% relative improvement in shallow parsing. For evaluation purposes, Ezafe tags...

Design and implementation of Persian spelling detection and correction system based on Semantic

The Persian language has special features (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, the design and implementation of NLP tools for Persian are more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...

Journal:
  • IJIMAI

Volume 2

Publication date: 2014